1. An apparatus for executing an MMX PSADBW instruction,
comprising: 

subtracters, for generating packed differences of 
packed operands of the instruction and for 
generating carry bits associated with each of the 
packed differences; 

inverters, coupled to said subtraction logic, for 
generating an inverse of each of said packed 
differences; 

multiplexers, coupled to said inverters and said 
subtraction logic, each for selecting as an 
output said packed difference if said associated 
carry bit indicates the packed difference is 
positive, and for selecting as said output said 
inverse if said associated carry bit indicates 
the packed difference is negative; and 

adders, coupled to said multiplexers, for adding said 
carry bits and said outputs of said multiplexers 
to generate a result of the instruction. 

2. The apparatus of claim 1, further comprising:

an instruction type input, for specifying whether the 
PSADBW instruction or a multiply instruction is 
being executed by the apparatus; and 

second multiplexers, coupled to said first 
multiplexers, for providing to said adders said 
carry bits and said first outputs of said first 
multiplexers if said instruction type input 
specifies the PSADBW instruction, and for 
providing partial products if said instruction 
type input specifies a multiply instruction. 

3. The apparatus of claim 1, wherein said adders 
comprise:

first and second adders, for adding first and second 
pluralities of partial products of at least one 
multiply instruction. 

4. The apparatus of claim 3, wherein said adders further
comprise:

a third adder, coupled to said first and second
adders, for adding first and second sums 
generated by said first and second adders to 
generate a result of the PSADBW instruction. 

5. The apparatus of claim 4, wherein said third adder is 
also selectively employed to generate a sum of product 
results of said at least one multiply instruction. 

6. The apparatus of claim 4, wherein said first sum 
comprises a sum of said carry bits. 

7. The apparatus of claim 4, wherein said second sum 
comprises a sum of said outputs of said multiplexers. 

8. The apparatus of claim 1, wherein each of said carry
bits comprises a Boolean zero value if said associated
packed difference is positive and comprises a Boolean
one value if said associated packed difference is
negative.

9. The apparatus of claim 1, further comprising:

a plurality of storage elements, for storing said 
carry bits. 

10. The apparatus of claim 1, wherein said adders add said
carry bits and said outputs of said multiplexers 
substantially in parallel. 

11. The apparatus of claim 1, wherein a computer program 
product comprising a computer usable medium having 
computer readable program code causes the apparatus, 
wherein said computer program product is for use with 
a computing device. 

12. The apparatus of claim 1, wherein a computer data 
signal embodied in a transmission medium comprising 
computer-readable program code provides the apparatus. 
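
By way of illustration only, the datapath recited in the apparatus claims above can be modeled in software. The following is a minimal sketch, not the claimed hardware: the function name `psadbw_sketch`, the use of Python lists for the packed byte operands, and the variable names are assumptions introduced for this example. It relies on the two's complement identity that the magnitude of a negative 8-bit difference equals its bitwise inverse plus one, which is why adding the carry bits to the selected values yields the sum of absolute differences.

```python
def psadbw_sketch(a_bytes, b_bytes):
    """Model of the subtract / invert / select / add datapath (illustrative)."""
    carries = []   # one carry bit per packed difference
    selected = []  # multiplexer outputs
    for a, b in zip(a_bytes, b_bytes):
        raw = a - b
        diff = raw & 0xFF             # 8-bit packed difference
        carry = 1 if raw < 0 else 0   # 0 = positive, 1 = negative
        inverse = ~diff & 0xFF        # inverter output (one's complement)
        selected.append(inverse if carry else diff)  # multiplexer selection
        carries.append(carry)
    first_sum = sum(carries)          # sum of the carry bits
    second_sum = sum(selected)        # sum of the multiplexer outputs
    return first_sum + second_sum     # final addition produces the result


a = [3, 10, 0, 255, 7, 7, 200, 1]
b = [5, 4, 255, 0, 7, 8, 100, 2]
result = psadbw_sketch(a, b)
```

In this sketch, each negative difference contributes its inverse to the second sum and a one to the first sum, so the final addition supplies the "plus one" that completes the two's complement negation.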

13. A microprocessor for generating a packed sum of
absolute differences, comprising: 

an instruction translator, for translating an MMX 
PSADBW macroinstruction into at least first and 
second microinstructions; and 

an MMX unit, coupled to said instruction translator, 
for generating a result of said PSADBW 
macroinstruction in response to said at least 
first and second microinstructions. 

14. The microprocessor of claim 13, wherein said MMX unit 
generates packed differences of said operands in 
response to said first microinstruction, and generates 
a sum of absolute values of said packed differences in 
response to said second microinstruction. 

15. The microprocessor of claim 13, wherein said MMX unit 
comprises: 

a plurality of subtracters, for generating said packed 
differences of said operands. 

16. The microprocessor of claim 15, wherein said plurality
of subtracters generate said packed differences of
said operands in a single microprocessor clock cycle.

17. The microprocessor of claim 15, wherein said plurality 
of subtracters also generate a sign for each of said
packed differences of said operands. 

18. The microprocessor of claim 13, wherein said MMX unit 
comprises:

multiplexing logic, having a microinstruction type 
control input, wherein if said control input 
indicates said microinstruction type is of said 
second microinstruction, then said multiplexing 
logic selects selectively inverted said packed 
differences of said operands for providing to an 
adder as a plurality of addends. 

19. The microprocessor of claim 18, wherein each of said 
packed differences is selectively inverted based on 
whether said packed difference is positive or 
negative.

20. The microprocessor of claim 19, wherein said packed
difference is inverted if said packed difference is 
negative and not inverted if said packed difference is 
positive.

21. The microprocessor of claim 18, wherein if said 
control input indicates said microinstruction type is 
not of said second microinstruction, then said 
plurality of multiplexers select a plurality of 
partial products from a multiplier for providing to 
said adder as said plurality of addends. 
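
By way of illustration only, the division of the PSADBW macroinstruction into first and second microinstructions can be modeled as two steps that share intermediate state. The following is a sketch under stated assumptions: the class `MmxUnitSketch`, its method names, and the choice to hold the signs alongside the packed differences between the two steps are hypothetical and are introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MmxUnitSketch:
    """Hypothetical software model of an MMX unit (illustrative only)."""
    diffs: list = field(default_factory=list)  # packed differences
    signs: list = field(default_factory=list)  # 1 if the difference is negative

    def uop_packed_subtract(self, minuend, subtrahend):
        # First microinstruction: generate packed differences and their signs.
        self.diffs = [(m - s) & 0xFF for m, s in zip(minuend, subtrahend)]
        self.signs = [1 if m < s else 0 for m, s in zip(minuend, subtrahend)]

    def uop_sum_absolute(self):
        # Second microinstruction: selectively invert, then add all addends.
        addends = [(~d & 0xFF) if s else d
                   for d, s in zip(self.diffs, self.signs)]
        return sum(addends) + sum(self.signs)


def psadbw_macro(minuend, subtrahend):
    # Assumed translation: one macroinstruction becomes two microinstructions.
    unit = MmxUnitSketch()
    unit.uop_packed_subtract(minuend, subtrahend)
    return unit.uop_sum_absolute()


total = psadbw_macro([1, 2, 3, 4, 5, 6, 7, 8], [8, 7, 6, 5, 4, 3, 2, 1])
```

The sketch adds the signs in the second step so that each inverted difference becomes a full two's complement negation; a model that omitted them would undercount by one for every negative difference.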

22. An apparatus for generating a packed sum of absolute
differences instruction in a microprocessor having
subtraction logic for generating differences of packed 
bytes in each of a minuend operand and a subtrahend 
operand of the instruction, the microprocessor also 
having logic for generating partial products of at 
least one multiply instruction, the microprocessor 
also having addition logic for adding the partial 
products, the apparatus comprising: 

a plurality of storage elements, each for storing a 
sign bit for indicating whether a corresponding 
one of the differences is positive or negative; 

a plurality of multiplexers, coupled to corresponding 
ones of said plurality of storage elements, each 
for outputting a value, wherein said value 
comprises said difference if said sign bit is 
positive and a complement of said difference if 
said sign bit is negative; and 

multiplexing logic, coupled to said plurality of
multiplexers, for selecting the partial products 
for provision to the addition logic when 
executing the at least one multiply instruction, 
and for selecting said sign bits and said values 
for provision to the addition logic when 
executing the packed sum of absolute differences 
instruction. 

23. The apparatus of claim 22, wherein the addition logic 
adds said sign bits and said values substantially in 
parallel.

24. The apparatus of claim 22, wherein said sign bits and 
said values comprise at least 16 addends added by the 
addition logic. 
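
By way of illustration only, the sharing of the addition logic between a multiply instruction and the packed sum of absolute differences instruction can be modeled as a selection of addends feeding a single summing function. The following is a sketch under stated assumptions: all function names are hypothetical, the multiply is modeled as an unsigned shift-and-add with sixteen partial products, and an actual multiplier may form its partial products differently (for example, by Booth encoding).

```python
def partial_products(x, y, width=16):
    # Assumed unsigned shift-and-add form: one partial product per bit of y.
    return [(x << i) if (y >> i) & 1 else 0 for i in range(width)]


def psad_addends(minuend, subtrahend):
    # Sign bits plus selected values: 8 + 8 = 16 addends for eight bytes.
    signs, values = [], []
    for m, s in zip(minuend, subtrahend):
        diff = (m - s) & 0xFF
        sign = 1 if m < s else 0
        values.append((~diff & 0xFF) if sign else diff)
        signs.append(sign)
    return signs + values


def addition_logic(addends):
    # Stand-in for the shared addition logic.
    return sum(addends)


def execute(kind, op1, op2):
    # Multiplexing logic: choose which addends reach the shared addition logic.
    if kind == "multiply":
        addends = partial_products(op1, op2)
    elif kind == "psad":
        addends = psad_addends(op1, op2)
    else:
        raise ValueError(f"unknown instruction type: {kind}")
    return addition_logic(addends)


product = execute("multiply", 123, 456)
sad = execute("psad", [1, 2, 3, 4, 5, 6, 7, 8], [8, 7, 6, 5, 4, 3, 2, 1])
```

With eight packed bytes, the sign bits and values together form sixteen addends, which in this sketch matches the number of partial products supplied for the multiply case.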

25. A method for executing an MMX PSADBW instruction,
comprising: 

generating packed differences of packed operands of 
the instruction and generating carry bits 
associated with each of the packed differences; 

for each of the packed differences, determining 
whether the carry bit indicates the packed 
difference is positive or negative; 

for each of the packed differences, selecting a value 
in response to said determining, said value 
comprising the packed difference if the 
associated carry bit is positive and a complement 
of the packed difference if the associated carry 
bit is negative; and 

adding the values selected and the carry bits to 
generate a result of the instruction. 

26. The method of claim 25, wherein said adding comprises: 

adding the carry bits to generate a first sum; 

adding the values to generate a second sum; and 

adding the first and second sums to generate the
result.

27. The method of claim 25, further comprising: 

determining whether the PSADBW instruction or a 
multiply instruction is being executed; 

said adding the values selected and the carry bits if 
the PSADBW instruction is being executed; and 

adding partial products if the multiply instruction is 
being executed. 

28. The method of claim 25, further comprising: 

storing the carry bits, after said generating the 
carry bits.

29. The method of claim 25, further comprising: 

translating the PSADBW instruction into at least first 
and second microinstructions, prior to said 
generating. 

30. The method of claim 29, further comprising: 

said generating in response to said first 
microinstruction; and 

said adding in response to said second
microinstruction.

31. The method of claim 25, wherein said selecting is 
performed in parallel for the packed differences. 

32. The method of claim 25, wherein said adding is 
performed in parallel. 

33. A computer data signal embodied in a transmission
medium, comprising:

computer-readable program code for providing an
apparatus for executing an MMX PSADBW 
instruction, said program code comprising: 

first program code for providing subtracters, for 
generating packed differences of packed 
operands of the instruction and for 
generating carry bits associated with each 
of the packed differences; 

second program code for providing inverters, 
coupled to said subtraction logic, for 
generating an inverse of each of said packed 
differences; 

third program code for providing multiplexers,
coupled to said inverters and said 
subtraction logic, each for selecting as an 
output said packed difference if said 
associated carry bit indicates the packed 
difference is positive, and for selecting as 
said output said inverse if said associated 
carry bit indicates the packed difference is 
negative; and 

fourth program code for providing adders, coupled 
to said multiplexers, for adding said carry 
bits and said outputs of said multiplexers 
to generate a result of the instruction. 



